Overview

Dataset statistics

Number of variables: 28
Number of observations: 31076
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 672
Duplicate rows (%): 2.2%
Total size in memory: 5.4 MiB
Average record size in memory: 182.0 B
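
The percentage and per-record figures above are simple ratios of the other rows. The following is a minimal sketch that recomputes them from the reported totals; the variable names are illustrative, and because the total size is only given rounded to 5.4 MiB, the recomputed record size agrees with the reported 182.0 B only approximately.

```python
# Recompute the derived dataset statistics from the reported totals.
n_rows = 31076
duplicate_rows = 672
total_bytes = 5.4 * 1024 ** 2  # 5.4 MiB as reported (already rounded)


def percentage(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


duplicate_pct = round(percentage(duplicate_rows, n_rows), 1)
avg_record_bytes = total_bytes / n_rows

print(f"Duplicate rows (%): {duplicate_pct}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```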

Variable types

Numeric: 11
Categorical: 11
Boolean: 6

Warnings

Reason has constant value "CR" (Constant)
Is_year_start has constant value "False" (Constant)
Dataset has 672 (2.2%) duplicate rows (Duplicates)
Id has a high cardinality: 3827 distinct values (High cardinality)
Applied is highly correlated with Received (High correlation)
Received is highly correlated with Applied (High correlation)
logapplied is highly correlated with logreceived (High correlation)
logreceived is highly correlated with logapplied (High correlation)
Year is highly correlated with Elapsed (High correlation)
Month is highly correlated with Week and 1 other field (High correlation)
Week is highly correlated with Month and 1 other field (High correlation)
Dayofyear is highly correlated with Month and 1 other field (High correlation)
Elapsed is highly correlated with Year (High correlation)
True_False is highly correlated with Reason and 1 other field (High correlation)
Reason is highly correlated with True_False and 14 other fields (High correlation)
Area is highly correlated with Reason and 1 other field (High correlation)
Is_month_end is highly correlated with Reason and 1 other field (High correlation)
Payment_Method is highly correlated with Reason and 2 other fields (High correlation)
Is_month_start is highly correlated with Reason and 1 other field (High correlation)
Gender is highly correlated with Reason and 1 other field (High correlation)
Is_year_start is highly correlated with True_False and 14 other fields (High correlation)
Age is highly correlated with Reason and 2 other fields (High correlation)
AgeGroup is highly correlated with Reason and 2 other fields (High correlation)
Location is highly correlated with Reason and 1 other field (High correlation)
Payment_Type is highly correlated with Reason and 2 other fields (High correlation)
Year is highly correlated with Reason and 1 other field (High correlation)
Is_year_end is highly correlated with Reason and 1 other field (High correlation)
Is_quarter_end is highly correlated with Reason and 1 other field (High correlation)
Is_quarter_start is highly correlated with Reason and 1 other field (High correlation)
Ratio is highly skewed (γ1 = -90.54043879) (Skewed)
Dayofweek has 5491 (17.7%) zeros (Zeros)
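
Several of the warnings above (Constant, High cardinality, Zeros, Skewed) are per-column checks that can be approximated in a few lines. The sketch below is illustrative only: the function names are hypothetical, the two thresholds are assumptions (the report's config.yaml is not reproduced here), and the skewness is the plain population estimate, which may differ slightly from the estimator the profiling tool uses.

```python
from collections import Counter

# Assumed thresholds for illustration; not taken from the report's configuration.
HIGH_CARDINALITY_THRESHOLD = 50
SKEWNESS_THRESHOLD = 20.0


def skewness(values):
    """Population skewness: third central moment divided by sigma cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def column_warnings(name, values):
    """Return warning strings for one column, mimicking the report's wording."""
    found = []
    counts = Counter(values)
    if len(counts) == 1:
        only = next(iter(counts))
        found.append(f'{name} has constant value "{only}" (Constant)')
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if is_numeric:
        zeros = counts[0]
        if zeros:
            share = 100 * zeros / len(values)
            found.append(f"{name} has {zeros} ({share:.1f}%) zeros (Zeros)")
        g1 = skewness(values)
        if abs(g1) > SKEWNESS_THRESHOLD:
            found.append(f"{name} is highly skewed (γ1 = {g1:.4f}) (Skewed)")
    elif len(counts) > HIGH_CARDINALITY_THRESHOLD:
        found.append(
            f"{name} has a high cardinality: {len(counts)} distinct values "
            "(High cardinality)"
        )
    return found


print(column_warnings("Reason", ["CR"] * 5))
print(column_warnings("Dayofweek", [0, 1, 2, 3]))
```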

Reproduction

Analysis started: 2021-04-26 20:46:14.102419
Analysis finished: 2021-04-26 20:47:12.496143
Duration: 58.39 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
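
The Duration row is the difference between the two timestamps. A small sketch of that arithmetic, assuming both timestamps are in the same time zone (the report does not state one):

```python
from datetime import datetime

# Timestamp layout as it appears in the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2021-04-26 20:46:14.102419", FMT)
finished = datetime.strptime("2021-04-26 20:47:12.496143", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```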

Variables

Applied
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1901
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 771.3485648
Minimum: 3
Maximum: 13942
Zeros: 0
Zeros (%): 0.0%
Memory size: 242.9 KiB

Quantile statistics

Minimum: 3
5th percentile: 150
Q1: 330
Median: 560
Q3: 1050
95th percentile: 2000
Maximum: 13942
Range: 13939
Interquartile range (IQR): 720
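
The Range and IQR rows follow directly from the quantiles listed above. A minimal sketch using the reported Applied quantiles; the 1.5 × IQR fences are a common outlier convention added here for illustration, not something the report computes.

```python
# Reported quantiles for Applied.
quantiles = {"min": 3, "q1": 330, "median": 560, "q3": 1050, "max": 13942}

value_range = quantiles["max"] - quantiles["min"]
iqr = quantiles["q3"] - quantiles["q1"]

# Tukey-style fences (an illustrative convention, not part of the report).
lower_fence = quantiles["q1"] - 1.5 * iqr
upper_fence = quantiles["q3"] + 1.5 * iqr

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"Fences: [{lower_fence}, {upper_fence}]")
```

Under that convention the maximum of 13942 lies far above the upper fence, which is consistent with the positive skewness reported for this variable.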

Descriptive statistics

Standard deviation: 607.652693
Coefficient of variation (CV): 0.7877796378
Kurtosis: 9.271635327
Mean: 771.3485648
Median Absolute Deviation (MAD): 300
Skewness: 1.710803489
Sum: 23970428
Variance: 369241.7953
Monotonicity: Not monotonic
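
Three of the descriptive statistics for Applied are tied together by simple identities: the mean is the sum over the row count, the coefficient of variation is the standard deviation over the mean, and the variance is the standard deviation squared. A short sketch that checks the reported figures against each other (variable names are illustrative):

```python
# Reported figures for Applied.
n_rows = 31076
total = 23970428
std = 607.652693

mean = total / n_rows   # Mean = Sum / number of observations
cv = std / mean         # Coefficient of variation = std / mean
variance = std ** 2     # Variance = std squared

print(f"Mean: {mean:.7f}")
print(f"CV: {cv:.10f}")
print(f"Variance: {variance:.4f}")
```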
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
400                   579     1.9%
300                   563     1.8%
500                   554     1.8%
600                   548     1.8%
450                   404     1.3%
200                   385     1.2%
700                   384     1.2%
350                   373     1.2%
1000                  372     1.2%
1200                  346     1.1%
Other values (1891)   26568   85.5%

Value   Count   Frequency (%)
3       1       < 0.1%
4       1       < 0.1%
6       1       < 0.1%
10      1       < 0.1%
12      1       < 0.1%

Value   Count   Frequency (%)
13942   1       < 0.1%
7031    1       < 0.1%
6061    1       < 0.1%
5186    1       < 0.1%
4945    1       < 0.1%

Gender
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 242.9 KiB
F: 19794
M: 11278
GD: 4

Length

Max length: 2
Median length: 1
Mean length: 1.000128717
Min length: 1

Characters and Unicode

Total characters: 31080
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: M
3rd row: M
4th row: M
5th row: F

Value   Count   Frequency (%)
F       19794   63.7%
M       11278   36.3%
GD      4       < 0.1%

Histogram of lengths of the category

Value   Count   Frequency (%)
f       19794   63.7%
m       11278   36.3%
gd      4       < 0.1%

Most occurring characters

Value   Count   Frequency (%)
F       19794   63.7%
M       11278   36.3%
G       4       < 0.1%
D       4       < 0.1%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   31080   100.0%

Most frequent character per category

Value   Count   Frequency (%)
F       19794   63.7%
M       11278   36.3%
G       4       < 0.1%
D       4       < 0.1%

Most occurring scripts

Value   Count   Frequency (%)
Latin   31080   100.0%

Most frequent character per script

Value   Count   Frequency (%)
F       19794   63.7%
M       11278   36.3%
G       4       < 0.1%
D       4       < 0.1%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   31080   100.0%

Most frequent character per block

Value   Count   Frequency (%)
F       19794   63.7%
M       11278   36.3%
G       4       < 0.1%
D       4       < 0.1%
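
The length and character figures for Gender can all be derived from the three category counts. The sketch below rebuilds the total character count, the mean length, and the per-character counts from those frequencies; the helper names are illustrative.

```python
from collections import Counter

# Category frequencies reported for Gender.
category_counts = {"F": 19794, "M": 11278, "GD": 4}

n_rows = sum(category_counts.values())
total_characters = sum(len(value) * count for value, count in category_counts.items())
mean_length = total_characters / n_rows

# Each character of a category value occurs once per row holding that value.
char_counts = Counter()
for value, count in category_counts.items():
    for ch in value:
        char_counts[ch] += count

print(f"Total characters: {total_characters}")
print(f"Mean length: {mean_length:.9f}")
print(char_counts.most_common())
```

The four rows holding the two-character value GD account for the mean length sitting just above 1.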

Payment_Method
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 242.9 KiB
AV: 26795
RP: 4281

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 62152
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RP
2nd row: AV
3rd row: AV
4th row: RP
5th row: AV

Value   Count   Frequency (%)
AV      26795   86.2%
RP      4281    13.8%

Histogram of lengths of the category

Value   Count   Frequency (%)
av      26795   86.2%
rp      4281    13.8%

Most occurring characters

Value   Count   Frequency (%)
A       26795   43.1%
V       26795   43.1%
R       4281    6.9%
P       4281    6.9%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   62152   100.0%

Most frequent character per category

Value   Count   Frequency (%)
A       26795   43.1%
V       26795   43.1%
R       4281    6.9%
P       4281    6.9%

Most occurring scripts

Value   Count   Frequency (%)
Latin   62152   100.0%

Most frequent character per script

Value   Count   Frequency (%)
A       26795   43.1%
V       26795   43.1%
R       4281    6.9%
P       4281    6.9%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   62152   100.0%

Most frequent character per block

Value   Count   Frequency (%)
A       26795   43.1%
V       26795   43.1%
R       4281    6.9%
P       4281    6.9%

Location
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 242.9 KiB
M: 13959
NE: 9240
O: 3391
PP: 3374
U: 1112

Length

Max length: 2
Median length: 1
Mean length: 1.405908096
Min length: 1

Characters and Unicode

Total characters: 43690
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NE
2nd row: PP
3rd row: M
4th row: NE
5th row: NE

Value   Count   Frequency (%)
M       13959   44.9%
NE      9240    29.7%
O       3391    10.9%
PP      3374    10.9%
U       1112    3.6%

Histogram of lengths of the category

Value   Count   Frequency (%)
m       13959   44.9%
ne      9240    29.7%
o       3391    10.9%
pp      3374    10.9%
u       1112    3.6%

Most occurring characters

Value   Count   Frequency (%)
M       13959   32.0%
N       9240    21.1%
E       9240    21.1%
P       6748    15.4%
O       3391    7.8%
U       1112    2.5%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   43690   100.0%

Most frequent character per category

Value   Count   Frequency (%)
M       13959   32.0%
N       9240    21.1%
E       9240    21.1%
P       6748    15.4%
O       3391    7.8%
U       1112    2.5%

Most occurring scripts

Value   Count   Frequency (%)
Latin   43690   100.0%

Most frequent character per script

Value   Count   Frequency (%)
M       13959   32.0%
N       9240    21.1%
E       9240    21.1%
P       6748    15.4%
O       3391    7.8%
U       1112    2.5%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   43690   100.0%

Most frequent character per block

Value   Count   Frequency (%)
M       13959   32.0%
N       9240    21.1%
E       9240    21.1%
P       6748    15.4%
O       3391    7.8%
U       1112    2.5%

Received
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 4633
Distinct (%): 14.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 771.3352253
Minimum: 2.6
Maximum: 13941.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 242.9 KiB

Quantile statistics

Minimum: 2.6
5th percentile: 150
Q1: 330
Median: 560
Q3: 1050
95th percentile: 2000
Maximum: 13941.5
Range: 13938.9
Interquartile range (IQR): 720

Descriptive statistics

Standard deviation: 607.6522426
Coefficient of variation (CV): 0.7877926779
Kurtosis: 9.270519592
Mean: 771.3352253
Median Absolute Deviation (MAD): 300
Skewness: 1.710750861
Sum: 23970013.46
Variance: 369241.248
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                 Count   Frequency (%)
400                   576     1.9%
300                   559     1.8%
600                   544     1.8%
500                   543     1.7%
450                   399     1.3%
700                   383     1.2%
200                   380     1.2%
1000                  370     1.2%
350                   369     1.2%
1200                  345     1.1%
Other values (4623)   26608   85.6%

Value   Count   Frequency (%)
2.6     1       < 0.1%
4.12    1       < 0.1%
6.12    1       < 0.1%
10      1       < 0.1%
12      1       < 0.1%

Value     Count   Frequency (%)
13941.5   1       < 0.1%
7031.36   1       < 0.1%
6060.5    1       < 0.1%
5186      1       < 0.1%
4944.77   1       < 0.1%

Id
Categorical

HIGH CARDINALITY

Distinct: 3827
Distinct (%): 12.3%
Missing: 0
Missing (%): 0.0%
Memory size: 242.9 KiB
GHI000137495: 722
GHI000135471: 657
GHI000084252: 620
GHI001304576: 465
GHI000151115: 459
Other values (3822): 28153

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 372912
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1285
Unique (%): 4.1%

Sample

1st row: GHI000076584
2nd row: GHI000135471
3rd row: GHI000159249
4th row: GHI000844291
5th row: GHI000441861

Value                 Count   Frequency (%)
GHI000137495          722     2.3%
GHI000135471          657     2.1%
GHI000084252          620     2.0%
GHI001304576          465     1.5%
GHI000151115          459     1.5%
GHI000140573          457     1.5%
GHI000143720          449     1.4%
GHI000877574          424     1.4%
GHI000275983          345     1.1%
GHI000076584          303     1.0%
Other values (3817)   26175   84.2%

Histogram of lengths of the category

Value                 Count   Frequency (%)
ghi000137495          722     2.3%
ghi000135471          657     2.1%
ghi000084252          620     2.0%
ghi001304576          465     1.5%
ghi000151115          459     1.5%
ghi000140573          457     1.5%
ghi000143720          449     1.4%
ghi000877574          424     1.4%
ghi000275983          345     1.1%
ghi000076584          303     1.0%
Other values (3817)   26175   84.2%

Most occurring characters

Value              Count    Frequency (%)
0                  104660   28.1%
1                  32496    8.7%
G                  31076    8.3%
H                  31076    8.3%
I                  31076    8.3%
5                  20657    5.5%
4                  19958    5.4%
2                  18532    5.0%
3                  18498    5.0%
7                  18261    4.9%
Other values (3)   46622    12.5%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     279684   75.0%
Uppercase Letter   93228    25.0%

Most frequent character per category

Value   Count    Frequency (%)
0       104660   37.4%
1       32496    11.6%
5       20657    7.4%
4       19958    7.1%
2       18532    6.6%
3       18498    6.6%
7       18261    6.5%
8       17264    6.2%
9       15291    5.5%
6       14067    5.0%

Value   Count   Frequency (%)
G       31076   33.3%
H       31076   33.3%
I       31076   33.3%

Most occurring scripts

Value    Count    Frequency (%)
Common   279684   75.0%
Latin    93228    25.0%

Most frequent character per script

Value   Count    Frequency (%)
0       104660   37.4%
1       32496    11.6%
5       20657    7.4%
4       19958    7.1%
2       18532    6.6%
3       18498    6.6%
7       18261    6.5%
8       17264    6.2%
9       15291    5.5%
6       14067    5.0%

Value   Count   Frequency (%)
G       31076   33.3%
H       31076   33.3%
I       31076   33.3%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   372912   100.0%

Most frequent character per block

Value              Count    Frequency (%)
0                  104660   28.1%
1                  32496    8.7%
G                  31076    8.3%
H                  31076    8.3%
I                  31076    8.3%
5                  20657    5.5%
4                  19958    5.4%
2                  18532    5.0%
3                  18498    5.0%
7                  18261    4.9%
Other values (3)   46622    12.5%

Reason
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 242.9 KiB
CR: 31076

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 62152
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CR
2nd row: CR
3rd row: CR
4th row: CR
5th row: CR

Value   Count   Frequency (%)
CR      31076   100.0%

Histogram of lengths of the category

Value   Count   Frequency (%)
cr      31076   100.0%

Most occurring characters

Value   Count   Frequency (%)
C       31076   50.0%
R       31076   50.0%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   62152   100.0%

Most frequent character per category

Value   Count   Frequency (%)
C       31076   50.0%
R       31076   50.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   62152   100.0%

Most frequent character per script

Value   Count   Frequency (%)
C       31076   50.0%
R       31076   50.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   62152   100.0%

Most frequent character per block

Value   Count   Frequency (%)
C       31076   50.0%
R       31076   50.0%

Age
Categorical

HIGH CORRELATION

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 242.9 KiB
25-29: 5227
30-34: 4362
20-24: 4289
35-39: 3607
40-44: 2882
Other values (8): 10709

Length

Max length: 5
Median length: 5
Mean length: 4.873664564
Min length: 2

Characters and Unicode

Total characters: 151454
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 35-39
2nd row: 30-34
3rd row: 50-54
4th row: 35-39
5th row: 20-24

Value              Count   Frequency (%)
25-29              5227    16.8%
30-34              4362    14.0%
20-24              4289    13.8%
35-39              3607    11.6%
40-44              2882    9.3%
45-49              2676    8.6%
50-54              2160    7.0%
65+                1816    5.8%
55-59              1772    5.7%
60-64              1361    4.4%
Other values (3)   924     3.0%

Histogram of lengths of the category

Value              Count   Frequency (%)
25-29              5227    16.8%
30-34              4362    14.0%
20-24              4289    13.8%
35-39              3607    11.6%
40-44              2882    9.3%
45-49              2676    8.6%
50-54              2160    7.0%
65                 1816    5.8%
55-59              1772    5.7%
60-64              1361    4.4%
Other values (3)   924     3.0%

Most occurring characters

Value              Count   Frequency (%)
-                  29162   19.3%
4                  26170   17.3%
5                  22962   15.2%
2                  19032   12.6%
3                  15938   10.5%
0                  15054   9.9%
9                  14108   9.3%
6                  4556    3.0%
+                  1816    1.2%
1                  1750    1.2%
Other values (2)   906     0.6%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     120476   79.5%
Dash Punctuation   29162    19.3%
Math Symbol        1816     1.2%

Most frequent character per category

Value   Count   Frequency (%)
4       26170   21.7%
5       22962   19.1%
2       19032   15.8%
3       15938   13.2%
0       15054   12.5%
9       14108   11.7%
6       4556    3.8%
1       1750    1.5%
8       826     0.7%
7       80      0.1%

Value   Count   Frequency (%)
-       29162   100.0%

Value   Count   Frequency (%)
+       1816    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Common   151454   100.0%

Most frequent character per script

Value              Count   Frequency (%)
-                  29162   19.3%
4                  26170   17.3%
5                  22962   15.2%
2                  19032   12.6%
3                  15938   10.5%
0                  15054   9.9%
9                  14108   9.3%
6                  4556    3.0%
+                  1816    1.2%
1                  1750    1.2%
Other values (2)   906     0.6%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   151454   100.0%

Most frequent character per block

Value              Count   Frequency (%)
-                  29162   19.3%
4                  26170   17.3%
5                  22962   15.2%
2                  19032   12.6%
3                  15938   10.5%
0                  15054   9.9%
9                  14108   9.3%
6                  4556    3.0%
+                  1816    1.2%
1                  1750    1.2%
Other values (2)   906     0.6%
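
The category names in the tables above (Decimal Number, Dash Punctuation, Math Symbol) are Unicode general categories. As a rough sketch of how such a breakdown can be obtained, the example below classifies every character of a few Age values with Python's standard `unicodedata` module, which returns the two-letter codes `Nd`, `Pd` and `Sm` for those three categories. The three sample values are taken from the Age tables; the function name is illustrative and this is not claimed to be the profiling tool's own implementation.

```python
import unicodedata
from collections import Counter


def general_category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


sample = ["35-39", "30-34", "65+"]
counts = general_category_counts(sample)

# Nd = Decimal Number, Pd = Dash Punctuation, Sm = Math Symbol
for code, count in counts.most_common():
    print(code, count)
```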

Area
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 242.9 KiB
AM: 12773
O: 4202
C: 2898
W: 2458
BP: 2196
Other values (6): 6549

Length

Max length: 3
Median length: 2
Mean length: 1.614364783
Min length: 1

Characters and Unicode

Total characters: 50168
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AM
2nd row: O
3rd row: EC
4th row: N
5th row: T

Value   Count   Frequency (%)
AM      12773   41.1%
O       4202    13.5%
C       2898    9.3%
W       2458    7.9%
BP      2196    7.1%
NL      1323    4.3%
S       1266    4.1%
T       1255    4.0%
EC      1086    3.5%
Wlg     857     2.8%

Histogram of lengths of the category

Value   Count   Frequency (%)
am      12773   41.1%
o       4202    13.5%
c       2898    9.3%
w       2458    7.9%
bp      2196    7.1%
nl      1323    4.3%
s       1266    4.1%
t       1255    4.0%
ec      1086    3.5%
wlg     857     2.8%

Most occurring characters

Value              Count   Frequency (%)
A                  12773   25.5%
M                  12773   25.5%
O                  4202    8.4%
C                  3984    7.9%
W                  3315    6.6%
B                  2196    4.4%
P                  2196    4.4%
N                  2085    4.2%
L                  1323    2.6%
S                  1266    2.5%
Other values (4)   4055    8.1%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   48454   96.6%
Lowercase Letter   1714    3.4%

Most frequent character per category

Value              Count   Frequency (%)
A                  12773   26.4%
M                  12773   26.4%
O                  4202    8.7%
C                  3984    8.2%
W                  3315    6.8%
B                  2196    4.5%
P                  2196    4.5%
N                  2085    4.3%
L                  1323    2.7%
S                  1266    2.6%
Other values (2)   2341    4.8%

Value   Count   Frequency (%)
l       857     50.0%
g       857     50.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   50168   100.0%

Most frequent character per script

Value              Count   Frequency (%)
A                  12773   25.5%
M                  12773   25.5%
O                  4202    8.4%
C                  3984    7.9%
W                  3315    6.6%
B                  2196    4.4%
P                  2196    4.4%
N                  2085    4.2%
L                  1323    2.6%
S                  1266    2.5%
Other values (4)   4055    8.1%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   50168   100.0%

Most frequent character per block

Value              Count   Frequency (%)
A                  12773   25.5%
M                  12773   25.5%
O                  4202    8.4%
C                  3984    7.9%
W                  3315    6.6%
B                  2196    4.4%
P                  2196    4.4%
N                  2085    4.2%
L                  1323    2.6%
S                  1266    2.5%
Other values (4)   4055    8.1%

True_False
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 242.9 KiB
0: 30721
1: 355

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 31076
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Value   Count   Frequency (%)
0       30721   98.9%
1       355     1.1%

Histogram of lengths of the category

Value   Count   Frequency (%)
0       30721   98.9%
1       355     1.1%

Most occurring characters

Value   Count   Frequency (%)
0       30721   98.9%
1       355     1.1%

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   31076   100.0%

Most frequent character per category

Value   Count   Frequency (%)
0       30721   98.9%
1       355     1.1%

Most occurring scripts

Value    Count   Frequency (%)
Common   31076   100.0%

Most frequent character per script

Value   Count   Frequency (%)
0       30721   98.9%
1       355     1.1%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   31076   100.0%

Most frequent character per block

Value   Count   Frequency (%)
0       30721   98.9%
1       355     1.1%

AgeGroup
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 242.9 KiB
MidAge: 13527
Adult: 10342
Old: 7109
Teenage: 98

Length

Max length: 7
Median length: 5
Mean length: 4.984071309
Min length: 3

Characters and Unicode

Total characters: 154885
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MidAge
2nd row: MidAge
3rd row: Old
4th row: MidAge
5th row: Adult

Value     Count   Frequency (%)
MidAge    13527   43.5%
Adult     10342   33.3%
Old       7109    22.9%
Teenage   98      0.3%

Histogram of lengths of the category

Value     Count   Frequency (%)
midage    13527   43.5%
adult     10342   33.3%
old       7109    22.9%
teenage   98      0.3%

Most occurring characters

Value              Count   Frequency (%)
d                  30978   20.0%
A                  23869   15.4%
l                  17451   11.3%
e                  13821   8.9%
g                  13625   8.8%
M                  13527   8.7%
i                  13527   8.7%
u                  10342   6.7%
t                  10342   6.7%
O                  7109    4.6%
Other values (3)   294     0.2%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   110282   71.2%
Uppercase Letter   44603    28.8%

Most frequent character per category

Value   Count   Frequency (%)
d       30978   28.1%
l       17451   15.8%
e       13821   12.5%
g       13625   12.4%
i       13527   12.3%
u       10342   9.4%
t       10342   9.4%
n       98      0.1%
a       98      0.1%

Value   Count   Frequency (%)
A       23869   53.5%
M       13527   30.3%
O       7109    15.9%
T       98      0.2%

Most occurring scripts

Value   Count    Frequency (%)
Latin   154885   100.0%

Most frequent character per script

Value              Count   Frequency (%)
d                  30978   20.0%
A                  23869   15.4%
l                  17451   11.3%
e                  13821   8.9%
g                  13625   8.8%
M                  13527   8.7%
i                  13527   8.7%
u                  10342   6.7%
t                  10342   6.7%
O                  7109    4.6%
Other values (3)   294     0.2%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   154885   100.0%

Most frequent character per block

Value              Count   Frequency (%)
d                  30978   20.0%
A                  23869   15.4%
l                  17451   11.3%
e                  13821   8.9%
g                  13625   8.8%
M                  13527   8.7%
i                  13527   8.7%
u                  10342   6.7%
t                  10342   6.7%
O                  7109    4.6%
Other values (3)   294     0.2%

Payment_Type
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 242.9 KiB
AV: 26795
RPU: 4281

Length

Max length: 3
Median length: 2
Mean length: 2.137759042
Min length: 2

Characters and Unicode

Total characters: 66433
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RPU
2nd row: AV
3rd row: AV
4th row: RPU
5th row: AV

Value   Count   Frequency (%)
AV      26795   86.2%
RPU     4281    13.8%

Histogram of lengths of the category

Value   Count   Frequency (%)
av      26795   86.2%
rpu     4281    13.8%

Most occurring characters

Value   Count   Frequency (%)
A       26795   40.3%
V       26795   40.3%
R       4281    6.4%
P       4281    6.4%
U       4281    6.4%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   66433   100.0%

Most frequent character per category

Value   Count   Frequency (%)
A       26795   40.3%
V       26795   40.3%
R       4281    6.4%
P       4281    6.4%
U       4281    6.4%

Most occurring scripts

Value   Count   Frequency (%)
Latin   66433   100.0%

Most frequent character per script

Value   Count   Frequency (%)
A       26795   40.3%
V       26795   40.3%
R       4281    6.4%
P       4281    6.4%
U       4281    6.4%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   66433   100.0%

Most frequent character per block

Value   Count   Frequency (%)
A       26795   40.3%
V       26795   40.3%
R       4281    6.4%
P       4281    6.4%
U       4281    6.4%

logapplied
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1901
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.345708332
Minimum: 1.098612289
Maximum: 9.542661146
Zeros: 0
Zeros (%): 0.0%
Memory size: 242.9 KiB

Quantile statistics

Minimum: 1.098612289
5th percentile: 5.010635294
Q1: 5.799092654
Median: 6.327936784
Q3: 6.956545443
95th percentile: 7.60090246
Maximum: 9.542661146
Range: 8.444048857
Interquartile range (IQR): 1.157452789

Descriptive statistics

Standard deviation: 0.8117956407
Coefficient of variation (CV): 0.1279282939
Kurtosis: -0.03917302714
Mean: 6.345708332
Median Absolute Deviation (MAD): 0.5798184953
Skewness: -0.2773979633
Sum: 197199.2321
Variance: 0.6590121622
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                 Count   Frequency (%)
5.991464547           579     1.9%
5.703782475           563     1.8%
6.214608098           554     1.8%
6.396929655           548     1.8%
6.109247583           404     1.3%
5.298317367           385     1.2%
6.551080335           384     1.2%
5.857933154           373     1.2%
6.907755279           372     1.2%
7.090076836           346     1.1%
Other values (1891)   26568   85.5%

Value         Count   Frequency (%)
1.098612289   1       < 0.1%
1.386294361   1       < 0.1%
1.791759469   1       < 0.1%
2.302585093   1       < 0.1%
2.48490665    1       < 0.1%

Value         Count   Frequency (%)
9.542661146   1       < 0.1%
8.858084222   1       < 0.1%
8.709630082   1       < 0.1%
8.553717966   1       < 0.1%
8.506132244   1       < 0.1%
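
The logapplied values are consistent with a natural logarithm of Applied: the minimum 1.098612289 matches ln(3) and the most common value 5.991464547 matches ln(400). The report does not state how the column was built, so the sketch below treats that transform as an assumption and simply checks the correspondence on a few reported values.

```python
import math

# Reported Applied values and the logapplied values they appear to map to.
reported_pairs = {
    3: 1.098612289,       # minimum
    400: 5.991464547,     # most common value
    13942: 9.542661146,   # maximum
}

for applied, reported_log in reported_pairs.items():
    computed = math.log(applied)  # natural log (assumed transform)
    print(f"ln({applied}) = {computed:.9f}  reported: {reported_log}")
```

A log transform of this kind also explains why the skewness drops from about 1.71 for Applied to about -0.28 for logapplied.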

logreceived
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 4633
Distinct (%): 14.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.345670011
Minimum: 0.955511445
Maximum: 9.542625283
Zeros: 0
Zeros (%): 0.0%
Memory size: 242.9 KiB

Quantile statistics

Minimum: 0.955511445
5th percentile: 5.010635294
Q1: 5.799092654
Median: 6.327936784
Q3: 6.956545443
95th percentile: 7.60090246
Maximum: 9.542625283
Range: 8.587113838
Interquartile range (IQR): 1.157452789

Descriptive statistics

Standard deviation: 0.8118505925
Coefficient of variation (CV): 0.1279377262
Kurtosis: -0.03452150853
Mean: 6.345670011
Median Absolute Deviation (MAD): 0.5798184953
Skewness: -0.2779963609
Sum: 197198.0413
Variance: 0.6591013846
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                 Count   Frequency (%)
5.991464547           576     1.9%
5.703782475           559     1.8%
6.396929655           544     1.8%
6.214608098           543     1.7%
6.109247583           399     1.3%
6.551080335           383     1.2%
5.298317367           380     1.2%
6.907755279           370     1.2%
5.857933154           369     1.2%
7.090076836           345     1.1%
Other values (4623)   26608   85.6%

Value         Count   Frequency (%)
0.955511445   1       < 0.1%
1.415853163   1       < 0.1%
1.811562097   1       < 0.1%
2.302585093   1       < 0.1%
2.48490665    1       < 0.1%

Value         Count   Frequency (%)
9.542625283   1       < 0.1%
8.858135423   1       < 0.1%
8.709547584   1       < 0.1%
8.553717966   1       < 0.1%
8.506085731   1       < 0.1%

Ratio
Real number (ℝ≥0)

SKEWED

Distinct: 3273
Distinct (%): 10.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9999621512
Minimum: 0.8666666667
Maximum: 1.03
Zeros: 0
Zeros (%): 0.0%
Memory size: 242.9 KiB

Quantile statistics

Minimum: 0.8666666667
5th percentile: 0.9996148893
Q1: 1
Median: 1
Q3: 1
95th percentile: 1.000089955
Maximum: 1.03
Range: 0.1633333333
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.0009418376605
Coefficient of variation (CV): 0.0009418733093
Kurtosis: 12987.70877
Mean: 0.9999621512
Median Absolute Deviation (MAD): 0
Skewness: -90.54043879
Sum: 31074.82381
Variance: 8.870581787 × 10⁻⁷
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
1                     26670   85.8%
0.9982638889          24      0.1%
0.999034749           24      0.1%
0.9987593052          21      0.1%
0.9988262911          14      < 0.1%
0.9991134752          14      < 0.1%
0.9966666667          13      < 0.1%
1.000068695           13      < 0.1%
0.9999086758          12      < 0.1%
0.998940678           12      < 0.1%
Other values (3263)   4259    13.7%

Value          Count   Frequency (%)
0.8666666667   1       < 0.1%
0.9782608696   1       < 0.1%
0.9817391304   1       < 0.1%
0.9841666667   1       < 0.1%
0.9851612903   1       < 0.1%

Value         Count   Frequency (%)
1.03          1       < 0.1%
1.02          2       < 0.1%
1.017083333   1       < 0.1%
1.013714286   1       < 0.1%
1.013333333   1       < 0.1%
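
The Ratio figures are consistent with Ratio being Received divided by Applied: the minimum 0.8666666667 equals 2.6 / 3, the smallest Received and Applied values. That definition, and the assumption that those two minima come from the same row, are not stated in the report; the sketch below only checks the arithmetic, along with the variance as the square of the reported standard deviation.

```python
def ratio(received: float, applied: float) -> float:
    """Assumed definition of the Ratio column: Received / Applied."""
    return received / applied


# Smallest reported Received and Applied values (assumed to share a row).
min_ratio = ratio(2.6, 3)

# Variance recomputed from the reported standard deviation.
std = 0.0009418376605
variance = std ** 2

print(f"Ratio for (2.6, 3): {min_ratio:.10f}")
print(f"Variance: {variance:.9e}")
```

With 85.8% of rows at exactly 1 and a single row near 0.87, one extreme value dominates the third and fourth moments, which is why the skewness and kurtosis are so large.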

Year
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 242.9 KiB
2019: 8073
2018: 7626
2020: 7535
2017: 7263
2016: 579

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 124304
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2020
3rd row: 2017
4th row: 2018
5th row: 2020

Value   Count   Frequency (%)
2019    8073    26.0%
2018    7626    24.5%
2020    7535    24.2%
2017    7263    23.4%
2016    579     1.9%

Histogram of lengths of the category

Value   Count   Frequency (%)
2019    8073    26.0%
2018    7626    24.5%
2020    7535    24.2%
2017    7263    23.4%
2016    579     1.9%

Most occurring characters

Value   Count   Frequency (%)
2       38611   31.1%
0       38611   31.1%
1       23541   18.9%
9       8073    6.5%
8       7626    6.1%
7       7263    5.8%
6       579     0.5%

Most occurring categories

Value            Count    Frequency (%)
Decimal Number   124304   100.0%

Most frequent character per category

Value   Count   Frequency (%)
2       38611   31.1%
0       38611   31.1%
1       23541   18.9%
9       8073    6.5%
8       7626    6.1%
7       7263    5.8%
6       579     0.5%

Most occurring scripts

Value    Count    Frequency (%)
Common   124304   100.0%

Most frequent character per script

Value   Count   Frequency (%)
2       38611   31.1%
0       38611   31.1%
1       23541   18.9%
9       8073    6.5%
8       7626    6.1%
7       7263    5.8%
6       579     0.5%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   124304   100.0%

Most frequent character per block

Value   Count   Frequency (%)
2       38611   31.1%
0       38611   31.1%
1       23541   18.9%
9       8073    6.5%
8       7626    6.1%
7       7263    5.8%
6       579     0.5%

Month
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.731078646
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 242.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 7
Q3: 10
95th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.363299266
Coefficient of variation (CV): 0.4996672069
Kurtosis: -1.15003414
Mean: 6.731078646
Median Absolute Deviation (MAD): 3
Skewness: -0.1005964716
Sum: 209175
Variance: 11.31178195
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Value              Count   Frequency (%)
8                  2927    9.4%
11                 2907    9.4%
7                  2885    9.3%
5                  2775    8.9%
9                  2771    8.9%
6                  2724    8.8%
10                 2669    8.6%
3                  2508    8.1%
12                 2467    7.9%
2                  2423    7.8%
Other values (2)   4020    12.9%

Value   Count   Frequency (%)
1       2105    6.8%
2       2423    7.8%
3       2508    8.1%
4       1915    6.2%
5       2775    8.9%

Value   Count   Frequency (%)
12      2467    7.9%
11      2907    9.4%
10      2669    8.6%
9       2771    8.9%
8       2927    9.4%

Week
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 52
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.61755052
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Memory size: 242.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 4
Q1: 15
Median: 28
Q3: 40
95th percentile: 50
Maximum: 52
Range: 51
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 14.62574856
Coefficient of variation (CV): 0.5295816712
Kurtosis: -1.168248914
Mean: 27.61755052
Median Absolute Deviation (MAD): 12
Skewness: -0.09976652774
Sum: 858243
Variance: 213.9125209
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count   Frequency (%)
48                  776     2.5%
49                  743     2.4%
51                  715     2.3%
38                  705     2.3%
35                  703     2.3%
50                  698     2.2%
46                  690     2.2%
24                  690     2.2%
27                  685     2.2%
34                  683     2.2%
Other values (42)   23988   77.2%

Value   Count   Frequency (%)
1       214     0.7%
2       540     1.7%
3       542     1.7%
4       568     1.8%
5       500     1.6%

Value   Count   Frequency (%)
52      216     0.7%
51      715     2.3%
50      698     2.2%
49      743     2.4%
48      776     2.5%

Day
Real number (ℝ≥0)

Distinct      31
Distinct (%)  0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          15.84985198
Minimum       1
Maximum       31
Zeros         0
Zeros (%)     0.0%
Memory size   242.9 KiB

Quantile statistics

Minimum                    1
5-th percentile            2
Q1                         9
median                     16
Q3                         23
95-th percentile           30
Maximum                    31
Range                      30
Interquartile range (IQR)  14

Descriptive statistics

Standard deviation               8.601931361
Coefficient of variation (CV)    0.5427136717
Kurtosis                         -1.133381584
Mean                             15.84985198
Median Absolute Deviation (MAD)  7
Skewness                         0.003603339528
Sum                              492550
Variance                         73.99322315
Monotonicity                     Not monotonic

Histogram with fixed size bins (bins=31)

Value              Count  Frequency (%)
20                 1248   4.0%
13                 1198   3.9%
17                 1170   3.8%
21                 1117   3.6%
11                 1106   3.6%
12                 1090   3.5%
19                 1082   3.5%
5                  1055   3.4%
16                 1042   3.4%
18                 1032   3.3%
Other values (21)  19936  64.2%

Value  Count  Frequency (%)
1      920    3.0%
2      866    2.8%
3      908    2.9%
4      912    2.9%
5      1055   3.4%

Value  Count  Frequency (%)
31     573    1.8%
30     991    3.2%
29     889    2.9%
28     969    3.1%
27     1017   3.3%

Dayofweek
Real number (ℝ≥0)

ZEROS

Distinct      6
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          2.100752993
Minimum       0
Maximum       5
Zeros         5491
Zeros (%)     17.7%
Memory size   242.9 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         1
median                     2
Q3                         3
95-th percentile           4
Maximum                    5
Range                      5
Interquartile range (IQR)  2

Descriptive statistics

Standard deviation               1.419840363
Coefficient of variation (CV)    0.6758721127
Kurtosis                         -1.246559229
Mean                             2.100752993
Median Absolute Deviation (MAD)  1
Skewness                         -0.04535549859
Sum                              65283
Variance                         2.015946657
Monotonicity                     Not monotonic

Histogram with fixed size bins (bins=6)

Value  Count  Frequency (%)
4      6578   21.2%
3      6501   20.9%
1      6204   20.0%
2      6082   19.6%
0      5491   17.7%
5      220    0.7%

Value  Count  Frequency (%)
0      5491   17.7%
1      6204   20.0%
2      6082   19.6%
3      6501   20.9%
4      6578   21.2%

Value  Count  Frequency (%)
5      220    0.7%
4      6578   21.2%
3      6501   20.9%
2      6082   19.6%
1      6204   20.0%

Dayofyear
Real number (ℝ≥0)

HIGH CORRELATION

Distinct      359
Distinct (%)  1.2%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          189.5892972
Minimum       3
Maximum       365
Zeros         0
Zeros (%)     0.0%
Memory size   242.9 KiB

Quantile statistics

Minimum                    3
5-th percentile            24
Q1                         101
median                     193
Q3                         276
95-th percentile           345
Maximum                    365
Range                      362
Interquartile range (IQR)  175

Descriptive statistics

Standard deviation               102.4193942
Coefficient of variation (CV)    0.5402171729
Kurtosis                         -1.163665309
Mean                             189.5892972
Median Absolute Deviation (MAD)  87
Skewness                         -0.09383894566
Sum                              5891677
Variance                         10489.7323
Monotonicity                     Not monotonic

Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
331                 157    0.5%
354                 148    0.5%
318                 147    0.5%
241                 147    0.5%
324                 146    0.5%
73                  144    0.5%
52                  143    0.5%
150                 141    0.5%
51                  141    0.5%
269                 140    0.5%
Other values (349)  29622  95.3%

Value  Count  Frequency (%)
3      40     0.1%
4      56     0.2%
5      30     0.1%
6      51     0.2%
7      51     0.2%

Value  Count  Frequency (%)
365    47     0.2%
364    40     0.1%
363    21     0.1%
362    34     0.1%
361    47     0.2%

Is_month_end
Boolean

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   30.5 KiB

Value  Count  Frequency (%)
False  30048  96.7%
True   1028   3.3%

Is_month_start
Boolean

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   30.5 KiB

Value  Count  Frequency (%)
False  30156  97.0%
True   920    3.0%

Is_quarter_end
Boolean

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   30.5 KiB

Value  Count  Frequency (%)
False  30843  99.3%
True   233    0.7%

Is_quarter_start
Boolean

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   30.5 KiB

Value  Count  Frequency (%)
False  30860  99.3%
True   216    0.7%

Is_year_end
Boolean

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   30.5 KiB

Value  Count  Frequency (%)
False  31047  99.9%
True   29     0.1%

Is_year_start
Boolean

CONSTANT
HIGH CORRELATION
REJECTED

Distinct      1
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   30.5 KiB

Value  Count  Frequency (%)
False  31076  100.0%

Elapsed
Real number (ℝ≥0)

HIGH CORRELATION

Distinct      1063
Distinct (%)  3.4%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          1545997227
Minimum       1480550400
Maximum       1606694400
Zeros         0
Zeros (%)     0.0%
Memory size   242.9 KiB

Quantile statistics

Minimum                    1480550400
5-th percentile            1487548800
Q1                         1513900800
median                     1546905600
Q3                         1576627200
95-th percentile           1601942400
Maximum                    1606694400
Range                      126144000
Interquartile range (IQR)  62726400

Descriptive statistics

Standard deviation               36700548.46
Coefficient of variation (CV)    0.02373907781
Kurtosis                         -1.180712162
Mean                             1545997227
Median Absolute Deviation (MAD)  31190400
Skewness                         -0.05312932608
Sum                              4.804340982 × 10^13
Variance                         1.346930257 × 10^15
Monotonicity                     Not monotonic

Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
1597104000           62     0.2%
1593475200           61     0.2%
1604966400           58     0.2%
1606435200           56     0.2%
1487203200           56     0.2%
1487289600           55     0.2%
1576022400           54     0.2%
1542931200           54     0.2%
1513209600           54     0.2%
1573171200           54     0.2%
Other values (1053)  30512  98.2%

Value       Count  Frequency (%)
1480550400  28     0.1%
1480636800  26     0.1%
1480896000  32     0.1%
1480982400  21     0.1%
1481068800  30     0.1%

Value       Count  Frequency (%)
1606694400  47     0.2%
1606521600  6      < 0.1%
1606435200  56     0.2%
1606348800  45     0.1%
1606262400  37     0.1%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
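
The calculation described above can be sketched in a few lines of plain Python. This is only an illustration, not the code the report generator runs; the function name `pearson_r` and the sample values are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors in the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Illustrative values only (not taken from the profiled dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson_r(x, y)
# Shifting y and rescaling it by a positive factor leaves r unchanged.
r_rescaled = pearson_r(x, [10.0 * v + 3.0 for v in y])
print(round(r, 4), round(r_rescaled, 4))
```

The second call illustrates the invariance under changes of location and scale: only a negative scale factor would flip the sign of r.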

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
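
A minimal sketch of that recipe, assuming ties receive the average of the ranks they span (a common convention, not something the report states). The helper names `average_ranks`, `pearson_r` and `spearman_rho` are made up for this example.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Illustrative data: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

On this cubic example the ranks of x and y coincide, so ρ reaches its maximum while Pearson's r stays below it.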

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
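
The pair-counting definition can be written down directly. The sketch below implements exactly the formula stated above (often called tau-a) and counts tied pairs as neither concordant nor discordant; library implementations may apply a tie correction and so give different values on tied data. The function name `kendall_tau_a` is an assumption for this example.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts for neither side
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs


# Illustrative data: one swapped pair out of six possible pairs.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

The loop visits every pair, so this sketch runs in quadratic time; that is fine for illustration but slow for tens of thousands of rows.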

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
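
As a rough sketch of the bias-corrected estimator: compute the chi-squared statistic of the contingency table, shrink φ² = χ²/n by (k-1)(r-1)/(n-1), shrink the row and column counts in the same spirit, and take the square root of the ratio. The function name `cramers_v_corrected` and the category labels below are illustrative assumptions; this is not the report generator's own code.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    joint = Counter(zip(xs, ys))
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        return float("nan")  # undefined when a variable is constant

    # Pearson chi-squared statistic over the full r x k contingency table.
    chi2 = 0.0
    for a, row_count in row_totals.items():
        for b, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return float("nan")  # sample too small for the correction
    return math.sqrt(phi2_corr / denom)


# Illustrative labels: the first pairing is perfectly associated,
# the second is independent by construction.
v_perfect = cramers_v_corrected(["F", "M"] * 50, ["AV", "RP"] * 50)
v_independent = cramers_v_corrected(["F", "F", "M", "M"] * 25,
                                    ["AV", "RP", "AV", "RP"] * 25)
print(v_perfect, v_independent)
```

Returning NaN for a constant variable is a design choice of this sketch; a column such as Reason, which has a single value, has no defined association with anything.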

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

(index)  Applied  Gender  Payment_Method  Location  Received  Id  Reason  Age  Area  True_False  AgeGroup  Payment_Type  logapplied  logreceived  Ratio  Year  Month  Week  Day  Dayofweek  Dayofyear  Is_month_end  Is_month_start  Is_quarter_end  Is_quarter_start  Is_year_end  Is_year_start  Elapsed
0  510.0  F  RP  NE  510.00  GHI000076584  CR  35-39  AM  0  MidAge  RPU  6.234411  6.234411  1.000000  2019  3  11  11  0  70  False  False  False  False  False  False  1552262400
1  252.0  M  AV  PP  252.00  GHI000135471  CR  30-34  O  0  MidAge  AV  5.529429  5.529429  1.000000  2020  4  15  8  2  99  False  False  False  False  False  False  1586304000
2  140.0  M  AV  M  140.00  GHI000159249  CR  50-54  EC  0  Old  AV  4.941642  4.941642  1.000000  2017  9  38  21  3  264  False  False  False  False  False  False  1505952000
3  380.0  M  RP  NE  380.00  GHI000844291  CR  35-39  N  0  MidAge  RPU  5.940171  5.940171  1.000000  2018  7  27  2  0  183  False  False  False  False  False  False  1530489600
4  320.0  F  AV  NE  320.00  GHI000441861  CR  20-24  T  0  Adult  AV  5.768321  5.768321  1.000000  2020  4  18  30  3  121  True  False  False  False  False  False  1588204800
5  1000.0  F  AV  PP  1000.00  GHI000153867  CR  20-24  AM  0  Adult  AV  6.907755  6.907755  1.000000  2018  10  42  19  4  292  False  False  False  False  False  False  1539907200
6  310.0  M  AV  NE  310.00  GHI000079891  CR  20-24  BP  0  Adult  AV  5.736572  5.736572  1.000000  2017  3  12  20  0  79  False  False  False  False  False  False  1489968000
7  640.0  F  AV  M  640.00  GHI001674496  CR  20-24  O  0  Adult  AV  6.461468  6.461468  1.000000  2019  2  6  4  0  35  False  False  False  False  False  False  1549238400
8  349.0  M  AV  PP  349.03  GHI000818689  CR  60-64  AM  0  Old  AV  5.855072  5.855158  1.000086  2018  10  40  2  1  275  False  False  False  False  False  False  1538438400
9  320.0  F  AV  NE  320.00  GHI001867266  CR  30-34  BP  0  MidAge  AV  5.768321  5.768321  1.000000  2020  1  3  15  2  15  False  False  False  False  False  False  1579046400

Last rows

(index)  Applied  Gender  Payment_Method  Location  Received  Id  Reason  Age  Area  True_False  AgeGroup  Payment_Type  logapplied  logreceived  Ratio  Year  Month  Week  Day  Dayofweek  Dayofyear  Is_month_end  Is_month_start  Is_quarter_end  Is_quarter_start  Is_year_end  Is_year_start  Elapsed
31066  300.0  F  AV  NE  300.00  GHI000219698  CR  35-39  O  0  MidAge  AV  5.703782  5.703782  1.000000  2020  8  32  4  1  217  False  False  False  False  False  False  1596499200
31067  1050.0  F  RP  PP  1050.00  GHI000141841  CR  40-44  AM  0  MidAge  RPU  6.956545  6.956545  1.000000  2019  11  46  12  1  316  False  False  False  False  False  False  1573516800
31068  720.0  F  AV  M  720.00  GHI000170760  CR  20-24  AM  0  Adult  AV  6.579251  6.579251  1.000000  2017  9  36  6  2  249  False  False  False  False  False  False  1504656000
31069  360.0  F  AV  NE  360.00  GHI000222795  CR  25-29  C  0  Adult  AV  5.886104  5.886104  1.000000  2017  10  41  12  3  285  False  False  False  False  False  False  1507766400
31070  640.0  F  AV  NE  640.00  GHI000077887  CR  25-29  BP  0  Adult  AV  6.461468  6.461468  1.000000  2017  4  17  27  3  117  False  False  False  False  False  False  1493251200
31071  1520.0  F  AV  NE  1520.00  GHI000140573  CR  50-54  AM  0  Old  AV  7.326466  7.326466  1.000000  2018  7  31  30  0  211  False  False  False  False  False  False  1532908800
31072  721.0  F  AV  NE  721.00  GHI000184125  CR  30-34  C  0  MidAge  AV  6.580639  6.580639  1.000000  2017  4  14  4  1  94  False  False  False  False  False  False  1491264000
31073  640.0  M  RP  M  640.00  GHI000084252  CR  50-54  AM  0  Old  RPU  6.461468  6.461468  1.000000  2019  7  28  9  1  190  False  False  False  False  False  False  1562630400
31074  380.0  M  AV  M  380.01  GHI000083247  CR  65+  O  0  Old  AV  5.940171  5.940198  1.000026  2018  1  2  9  1  9  False  False  False  False  False  False  1515456000
31075  800.0  M  RP  NE  800.00  GHI000088447  CR  35-39  W  0  MidAge  RPU  6.684612  6.684612  1.000000  2018  1  2  11  3  11  False  False  False  False  False  False  1515628800

Duplicate rows

Most frequent

(index)  Applied  Gender  Payment_Method  Location  Received  Id  Reason  Age  Area  True_False  AgeGroup  Payment_Type  logapplied  logreceived  Ratio  Year  Month  Week  Day  Dayofweek  Dayofyear  Is_month_end  Is_month_start  Is_quarter_end  Is_quarter_start  Is_year_end  Is_year_start  Elapsed  count
1  106.0  M  AV  M  106.0  GHI000135471  CR  45-49  O  0  MidAge  AV  4.663439  4.663439  1.0  2017  6  24  16  4  167  False  False  False  False  False  False  1497571200  3
5  125.0  M  AV  M  125.0  GHI000137735  CR  18-19  S  0  Adult  AV  4.828314  4.828314  1.0  2020  7  28  10  4  192  False  False  False  False  False  False  1594339200  3
182  315.0  M  AV  M  315.0  GHI000250631  CR  25-29  AM  0  Adult  AV  5.752573  5.752573  1.0  2020  11  46  11  2  316  False  False  False  False  False  False  1605052800  3
214  350.0  F  AV  PP  350.0  GHI001579782  CR  30-34  AM  0  MidAge  AV  5.857933  5.857933  1.0  2019  6  23  7  4  158  False  False  False  False  False  False  1559865600  3
229  360.0  F  AV  M  360.0  GHI001297159  CR  40-44  W  0  MidAge  AV  5.886104  5.886104  1.0  2017  11  44  1  2  305  False  True  False  False  False  False  1509494400  3
585  900.0  M  AV  NE  900.0  GHI000084252  CR  20-24  AM  0  Adult  AV  6.802395  6.802395  1.0  2018  1  1  3  2  3  False  False  False  False  False  False  1514937600  3
0  88.0  F  AV  M  88.0  GHI001755649  CR  30-34  AM  0  MidAge  AV  4.477337  4.477337  1.0  2020  6  24  8  0  160  False  False  False  False  False  False  1591574400  2
2  108.0  M  AV  M  108.0  GHI000135471  CR  30-34  O  0  MidAge  AV  4.682131  4.682131  1.0  2019  12  51  20  4  354  False  False  False  False  False  False  1576800000  2
3  108.0  M  AV  M  108.0  GHI000135471  CR  40-44  O  0  MidAge  AV  4.682131  4.682131  1.0  2019  9  37  12  3  255  False  False  False  False  False  False  1568246400  2
4  125.0  F  AV  M  125.0  GHI000620301  CR  45-49  BP  0  MidAge  AV  4.828314  4.828314  1.0  2018  7  28  12  3  193  False  False  False  False  False  False  1531353600  2